00:17
2026-06-20
modal.com
large-language-models
Speculation Is All You Need
Modal Labs released state-of-the-art DFlash speculators for Qwen 3.5 and Qwen 3.6 models on Hugging Face, achieving 5-20% additional speedups and enabling Qwen 3.5 122B-A10B to run at over 1000 tok/s โฆ